(57) Abstract 

Initially, a first template (58), having a first pattern similar to one of the distinctive features of the object (60), is passed over the video 
field and compared to it in order to preliminarily identify at least one possible distinctive feature as a candidate. A second template (80) 
is then created by taking one of the major elements of the distinctive feature candidate and extending that element all the way across the 
second template (80) and then comparing it to the distinctive feature candidate to eliminate one or more possibly falsely identified features. 
A third template (102) is then created having a pattern formed from another major element of said distinctive feature and extending it all 
the way across the third template (102), which is then likewise passed over the distinctive feature candidate and compared therewith in 
order to eliminate still further falsely identified features. The method is continued until all possible false alarm candidates (64, 68, 72, 76) 
have been eliminated. The process is then repeated in order to preliminarily identify two or three landmarks of the target object. 

LIVE VIDEO INSERTION SYSTEM INCLUDING TEMPLATE MATCHING 

BACKGROUND OF THE INVENTION 

1. Field of the Invention 

This invention relates to a device for inserting 
realistic indicia into video images. 

2. Description of the Related Art 

Electronic devices for inserting electronic 
images into live video signals, such as described in U.S. 
Patent 5,264,933 by Rosser, et al., have been developed for 
the purpose of inserting advertising into broadcast events, 
especially sports events. These devices are capable of 
seamlessly and realistically incorporating new logos or 
other indicia into the original video in real time, even as 
the original scene is zoomed, panned, or otherwise altered 
in size or perspective. In addition, in order to use these 
devices to alter a video feed downstream of the editor's 
mixing device, electronic insertion devices have to be 
capable of dealing with scene cuts. This requires 
recognizing a feature or features reliably and accurately 
within a very short time, typically a few fields of video 
or about 1/30th of a second. The need for fast 
recognition has meant that pyramid processing techniques, 
as described by Burt, et al., tend to be used. Pyramid 
processing is a well known technique in which an image is 
decomposed, sometimes referred to as "decimated," into a 
series of images, each of which comprises the whole 
original scene, but each with progressively less detailed 
information. Typically each successive image will have one 
quarter of the number of pixels of its predecessor. A 
level 3 (or third generation) image has 1/64th the number 
of pixels of the original. A search for a gross feature 
can thus be done 64 times faster on a level 3 pyramid image 
and this result quickly related back to the level 0 or 
original image. Speed is also improvable by searching for 
a small number of distinct landmarks or features that 
characterize the target object. This simplification of the 
search strategy, however, increases the possibility of 
false alarms or insertions. The enormity of the false 
alarm problem can be appreciated from the fact that in a 
typical three hour football game, there are 648,000 fields 
of video. This means that in a single football game there 
are at least 600,000 opportunities for the insertion device 
to do something that would be commercially unacceptable, 
such as inserting an advertisement in the crowd, or on a 
group of players, just because of a chance juxtaposition of 
features that fools the computer into thinking the current 
scene is equivalent to a scene it is looking to find. To 
avoid this possibility, or at least reduce the risk of it 
occurring to an acceptable commercial level, it is 
necessary to have recognition strategies that, on average, 
would only make one error in at least twice as many 
attempts at recognition as would occur in the event being 
covered. For a three hour football game, the computer must 
therefore make, on average, no more than one false 
insertion per 1.3 million fields of video. At the same 
time the search strategy must be kept sufficiently simple 
and invariant to changes in lighting conditions, video 
noise and incidental artifacts that may occur in the scene 
that it is attempting to recognize, that the recognition 
strategy can be performed by an affordable computing 
system in no more than 1/30th of a second. The final 
problem is that the systems capable of meeting these 
stringent requirements must be developed in a timely and 
efficient manner. This includes verifying that performance 
goals are being attained. 
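
The figures quoted above follow from simple arithmetic. The sketch below reproduces them as a rough check. It assumes exactly 60 fields per second and a quartering of the pixel count at each pyramid level, as the text states; the function names are illustrative only and do not come from the specification.

```python
def pyramid_pixel_fraction(level: int) -> float:
    """Fraction of the original pixel count left at a given pyramid level,
    assuming each level keeps one quarter of its predecessor's pixels."""
    return 1.0 / (4 ** level)


def fields_in_broadcast(hours: float, fields_per_second: int = 60) -> int:
    """Number of video fields in a broadcast of the given length."""
    return int(hours * 3600 * fields_per_second)


level3_fraction = pyramid_pixel_fraction(3)   # 1/64 of the original pixels
speedup = 1.0 / level3_fraction               # 64 times fewer pixels to search
game_fields = fields_in_broadcast(3)          # fields in a three hour game
error_budget = 2 * game_fields                # fields per tolerated false insertion

print(level3_fraction, speedup, game_fields, error_budget)
```

The last value, 1,296,000, is the "1.3 million fields" figure given in the text.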

Typically, electronic insertion devices as 
described in U.S. Patent 5,264,933 have used a dynamic 
pattern recognition method, as described in detail in U.S. 
Patent 5,063,603, the teachings of which are incorporated 
herein by reference. Briefly, as described in PCT WO 
93/06691, the preferred prior art dynamic pattern 
recognition method consists of representing a target 
pattern within a computer as a set of component patterns in 
a "pattern tree". Components near the root of the tree 
typically represent large scale features of the target 
pattern, while components away from the root represent 
progressively finer detail. The coarse patterns are 
represented at reduced resolution, while the detailed 
patterns are represented at high resolution. The search 
procedure matches the stored component patterns in the 
pattern tree to patterns in the scene. A match can be 
found, for example, by correlating the stored pattern with 
the image (represented in pyramid format). Patterns are 
matched sequentially, starting at the root of the tree. As 
a candidate match is found for each component pattern, its 
position in the image is used to guide the search for the 
next component. In this way a complex pattern can be 
located with relatively little computation. However, 
such correlation methods, while having the advantage of 
speed when the search tree is kept to a reasonable size - 
typically no more than twenty correlations in current 
hardware implementations - are liable to a significant false 
turn on rate. This is caused in part by the need for a 
simple search tree and in part by a problem fundamental to 
correlation techniques. The fundamental problem with 
correlation techniques in image pattern matching is that the 
stored pattern for each element of the search tree 
represents a particular pose of the object being looked for 
- i.e. a particular magnification and orientation. Even 
if the system only requires recognition on the same or 
similar orientation, magnification remains a significant 
problem as in a typical broadcast application, such as 
recognizing football goal posts. The difficulty is that 
the magnification of the goal post in the initial shot 
(i.e. the first image of the required goal post in a 
sequence of images containing it) may vary by a factor of 
two. This means that the stored pattern is in general of 
the wrong size, making the correlations weaker than in the 
case where the search pattern matches the image pattern 
exactly and thus more difficult to distinguish from other 
partially similar features. Traditional attempts to deal 
with this have been to include search trees containing 
images of different pose, particularly magnification. This 
results in longer search trees, and slower recognition. 
This is taken to an extreme in the system described in U.S. 
Patent 5,353,392 by Luquet in which all attempts to 
automatically cope with scene cuts are abandoned and the 
identifying marks are indicated manually on the first image 
of each sequence. This may be adequate for a non real 
time editing machine, or for a real time electronic 
insertion device attached to a single camera in a situation 
where the recognition landmarks are never fully occluded, 
but is unacceptable in a standard broadcast environment 
with the electronic insertion occurring downstream of the 
editor's switching equipment, or at a remote location. 

In U.S. Patent 4,817,175, Tenenbaum, et al., 
describe a pattern recognition system which uses parallel 
processing of the video input to attain speed. This system 
is directed towards inspection techniques in which the 
camera is under control of the recognition system and in 
which real-time performance is not required. The 
Tenenbaum, et al. system, therefore, uses time averaging of 
a number of frames of video to obtain high signal-to-noise 
in the image. The heart of that recognition strategy, 
which in the preferred embodiment is set up to locate 
rectangles of varying size, is to look for corners, because 
of their invariance to magnification, using corner 
templates and standard correlation techniques. As an 
example, Tenenbaum, et al. describe a system which has 
templates representing a corner at all possible 
orientations. This is used to locate all possible lower 
left hand corners of possible rectangles. From these, the 
system detects corners and then looks along the diagonal 
for the matching upper right hand corner, using only the 
corner template having the correct pose. Finally, the 
system uses the predicted location of the other two corners 
of the rectangle as a means of confirming the existence of 
the rectangle, again using corner templates in the correct 
pose. All correlation is done in the traditional manner, 
using like templates. 

The existing methods of structured pattern 
recognition used in electronic insertion devices require 
either relatively long and complex search trees, resulting 
in prior art methods taking too much time with existing 
hardware to be of use in a real time, multi-camera 
environment under the range of conditions required by 
conventional broadcast practice or, if the search trees are 
kept sufficiently simple, the search strategies become 
fragile, making them overly sensitive to false alarms in 
complex or noisy images, both of which are part of a real 
television broadcast. 

SUMMARY OF THE INVENTION 

Briefly described, the invention comprises an 
improved method for the recognition of target patterns in 
a digitized image. The preferred method of the invention 
combines speed of search with robustness in practical 
environments, especially to false alarms. The method 
includes all the tools for successfully implementing the 
method in a practical situation. 

The preferred method uses a structured search 
tree, in which the initial elements of the tree are kept simple, 
comprising zoom invariant features such as corners or 
edges. Then in the "outer branches" of the search tree, 
the method of the invention is to switch from standard 
matching techniques (i.e. correlation of a given pattern 
looking for a match to the same pattern in the image), to 
a technique referred to here as "unlike feature 
correlation" in which patterns of one feature are 
deliberately used in a correlation over an area we believe 
to comprise another feature. For example, a pattern of a 
line is correlated centered on a part of an image we 
believe to be a left handed corner. Suitable 
interpretation of the resultant correlation pattern of the 
two unlike features allows the method, for example, to 
verify the existence of a left hand corner very quickly and 
accurately. Thus it is possible to keep the speed of 
structured pattern recognition with simple trees, without 
incurring their fragility to either different magnification 
or false turn on in complex images. This immunity to false 
alarm allows the system to further speed up the search by 
running several search strategies in parallel. (Each 
search strategy has to be twice as immune to false alarm in 
order to run two in parallel without impacting the total 
sensitivity to false alarm. Similarly, 3 search strategies 
run in parallel require each to be 3 times as immune to 
false alarm and so on). As a means of attempting to reduce 
the false alarm rate, it is common practice, having located 
a target with an initial search strategy, to verify this 
target by running correlation matches at a relatively large 
number (10 - 20) of parts of the image, looking to find 
correlation values above a certain threshold. Because 
correlations are usually run over a reasonable number of 
pixels - from 3 to 15 - this technique is a good positive 
confirmation, but is very open to false alarm. By using 
simple search-patterns, but then interpolating around the 
position of the maximum correlation, the system first gets 
sub-pixel information about the location of part of the 
feature, and then the system uses a very strict geometrical 
check of the relative location of the parts of the scene it 
is attempting to find. 
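
The interpolation step described above can be illustrated with a common three-point parabolic fit. This is a minimal sketch only: the specification does not say which interpolation formula is used (the 2-D method is the subject of Fig. 5), so the parabola, the function names and the per-axis treatment are all assumptions made for illustration.

```python
def subpixel_peak_offset(left: float, centre: float, right: float) -> float:
    """Offset of the true peak from the integer maximum, from a parabola
    fitted through three neighbouring correlation samples. When centre is
    the largest sample the result lies between -0.5 and +0.5 pixels."""
    denominator = left - 2.0 * centre + right
    if denominator >= 0.0:
        # Flat or not a local maximum: no refinement is possible.
        return 0.0
    return 0.5 * (left - right) / denominator


def refine_peak(surface):
    """Return (row, col) of the correlation maximum, refined to sub-pixel
    precision independently along each axis (interior peaks only)."""
    rows, cols = len(surface), len(surface[0])
    r, c = max(((i, j) for i in range(rows) for j in range(cols)),
               key=lambda rc: surface[rc[0]][rc[1]])
    dr = dc = 0.0
    if 0 < r < rows - 1:
        dr = subpixel_peak_offset(surface[r - 1][c], surface[r][c], surface[r + 1][c])
    if 0 < c < cols - 1:
        dc = subpixel_peak_offset(surface[r][c - 1], surface[r][c], surface[r][c + 1])
    return r + dr, c + dc


# Synthetic correlation surface whose true peak is at row 2.3, column 1.8.
surface = [[-((i - 2.3) ** 2 + (j - 1.8) ** 2) for j in range(5)] for i in range(5)]
refined_row, refined_col = refine_peak(surface)
```

With sub-pixel positions for several parts of the target, the relative distances between them can be compared against the known geometry much more strictly than with whole-pixel peak positions.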

The combined strategy of search trees with built 
in "other feature correlation" feature verifiers, and the 
subsequent structure verification by a strict sub-pixel 
geometrical check of known relative positions of the 
overall structure, provides electronic insertion devices 
with the ability to accurately identify features within a 
full video image within one to two video fields, with 
false turn on rates in the 1 in 2 million range on random 
video fields, which is roughly two orders of magnitude 
better than the conventional search tree strategies with 
just the peak value check of the structure. 

The final part of the invention which allows this 
improvement in performance to be implemented on a routine 
basis is the verification or test part of the strategy. 
This has two parts. The first is the straightforward 
automatic logging of false alarms, including the capture of 
the image that caused the false alarm. This allows 
strategies to be tested over a large number of random 
frames. The second part of the test strategy first 
requires a reiteration of the way electronic video 
insertion devices operate. Having found the required 
object with the initial search strategy, the assumption is 
that for a while, the video will display a continuous 
scene, in which each video field is very similar to the 
previous one, with only relatively small changes in 
magnification and translation, with a smaller amount of shear 
or rotation. On these subsequent fields, the computer 
thus has the much simpler task of, given the object's 
position and size in one field, finding it in the next. 
Totally different and much simpler strategies are then able 
to track the object from field to field. The test strategy 
to allow rapid assessment of the robustness of the 
preferred search-strategy is to force the system back to 
doing a search from scratch on every field or frame of 
video, even if the previous one was successful. This 
simple, but novel test allows the system to use relatively 
short video sequences to assess how given search trees will 
perform over a large number of initial scenes. 
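
The two parts of the test strategy described above can be sketched as a small harness. Everything here is hypothetical: `search`, `target_present` and the toy string "fields" merely stand in for the real search tree, the ground truth and the video images, none of which are specified at this level of detail in the text.

```python
from typing import Callable, Iterable, List, Optional, Tuple

Pose = Tuple[float, float]  # hypothetical (x, y) location reported by the search


def forced_search_test(
    fields: Iterable[object],
    search: Callable[[object], Optional[Pose]],
    target_present: Callable[[int], bool],
) -> List[Tuple[int, object]]:
    """Run a full search on every field (tracking disabled) and log false alarms.

    Returns (field index, field image) pairs for every field on which the
    search reported a target although none was present.
    """
    false_alarms = []
    for index, field in enumerate(fields):
        pose = search(field)  # always search from scratch, never track
        if pose is not None and not target_present(index):
            false_alarms.append((index, field))  # capture the offending image
    return false_alarms


# Toy usage: strings stand in for video fields.
toy_fields = ["goal", "crowd", "goal", "players"]
toy_search = lambda f: (0.0, 0.0) if f in ("goal", "players") else None
logged = forced_search_test(toy_fields, toy_search, lambda i: toy_fields[i] == "goal")
```

Because every field is treated as if it followed a scene cut, a short clip exercises the search tree as many times as it has fields.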

The combined search and testing strategies 
comprise a method of producing practical, robust search 
mechanisms that allow electronic insertion devices to be 
used in real time in realistic broadcast environments and, 
if necessary, with multi-camera feeds and downstream of 
editing and video effects machines. 

BRIEF DESCRIPTION OF THE DRAWINGS 

Fig. 1 is a schematic representation of a live 
video insertion system, according to the preferred 
embodiment of the invention. 

Fig. 2a is a schematic representation of pyramid 
decimation of an image as used in the preferred embodiment. 

Fig. 2b is a schematic 
representation of a search tree as used in the prior art. 

Fig. 2c illustrates a search tree as used in the 
preferred embodiment. 

Fig. 3 is a diagrammatic representation of both 
reference templates, objects in the scene, and the 
corresponding correlation surfaces that are generated when 
both like and unlike correlations are performed. 

Fig. 4a illustrates the method of verifying 
existence of a specific object by multiple correlation, as 
used in the prior art. 

Fig. 4b illustrates an example of mismatch when 
using only the verification of existence of a specific 
object by multiple correlation. 

Fig. 5 illustrates the 2-D method of sub-pixel 
interpolation on correlation surfaces. 

Fig. 6 is a flow diagram of a live video 
insertion system incorporating the modifications of the 
preferred embodiment. 

Figs. 7A - 7C show the three classes of two 
dimensional linear invariant features of the type that the 
preferred embodiment of the invention may seek to identify. 

Fig. 8 illustrates a reverse L shaped target. 

Fig. 9 shows four linked correlation surfaces 
used to create a function which illustrates a generalized, 
highly certain method for detecting linear invariant 
features. 

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT 

During the course of this description like 
numbers will be used to identify like elements according to 
the different figures which illustrate the invention. To 
understand the invention it is easiest to take component 
parts and look at each in turn. 

The method of the unlike feature correlation is 
best understood by considering how electronic insertion 
devices do rapid pattern recognition. 

The electronic insertion system 10 gets an image 
taken by a conventional television camera 12 of a scene 14. 
The views from different cameras 12 are routed through an 
editor's switching controller 16 by means of which the 
director of the program (typically a trained human, not 
shown) edits together different camera shots to produce the 
television program that is sent to the end viewer over the 
network. Before the television program is sent over the 
network by means such as a broadcast antenna 18, the video 
feed is passed through the electronic insertion device 10. 
In this the incoming signal, typically in analog NTSC 
format, is converted into a digital format by an encoder 
20. The digital image 22 is then converted into a pyramid 
of images 24 by a pyramid producing device 26. The pyramid 
of the incoming live image 24 is then compared against a 
pyramid of a prestored reference image 28 which contains 
the target being looked for. This comparison of incoming 
live image against the prestored reference image is done by 
a search comparator device 30 using a search tree of 
templates 32 which comprises a sequence of small (typically 
8 pixel by 8 pixel) sub-images (or templates) taken from 
the pyramid of the reference image 28. The search device 
30 uses these search templates 32 in a predetermined 
sequence in order to rapidly ascertain if the sought after 
target appears in the current image of interest and, if so, 
in what pose - i.e. what is the target's current 
magnification, translation and rotation with respect to the 
target represented in the pyramid of the reference image 
28. The unlike feature correlation part of this invention 
is an improvement to existing practice which allows the 
search device 30 to do a rapid and robust search for the 
target. In a conventional dynamic pattern recognition 
search, as detailed in, for instance, U.S. Patent 
5,063,603, a typical search tree looks like that shown 
schematically in Figure 2b. The first node 32 is an eight 
by eight pixel sub-image of the level 3 reference image 34 
(derived by 3 sequences of appropriate decimation from the 
level 0 image 36) representing, for example, a right hand 
corner of a football goal post. The pyramid may, in 
addition, be Gaussian filtered, or Laplacian filtered (or 
any other suitable filtering), as discussed in detail by 
Burt. A typical mode of search is to step through the 
level 3 pyramid of the incoming image in a raster fashion 
doing correlations with the first node template 32. At 
level 3, in an NTSC system, the size of the image of a 
single field is 90 pixels long by 30 pixels deep. Using a 
hardware correlator that can do correlations over a 15 by 
15 region in a single pass, such as the Data Cube MaxVideo 
20 hardware board, the first node of the search tree can be 
searched for in 12 passes or correlations, each of which 
takes about 1 millisecond. (A field time of NTSC video is 
1/60th of a second, or 16.6 milliseconds). The maximum 
correlation is found and, if above an experimentally 
determined threshold, it is assumed to have found a right 
hand corner, as that is the pattern that will give the 
single highest match or correlation. Having found the 
right hand corner, the search algorithm then looks to see 
if there is a left hand corner in the appropriate place. 
It does this by taking the template 38 of a left hand 
corner taken from the reference pyramid image 34, and does 
a single correlation at the offset from the position of the 
right hand corner just located, based on the distance 
between corners in the reference level 3 image 34. The 
magnification between the incoming live image and the 
prestored reference image is assumed to be 1.0. Because 
the correlation is over a 15 by 15 region there is some 
tolerance in the magnification of the incoming image that 
can be detected, even with this initial assumption that the 
incoming image has the same magnification as the reference 
image. This tolerance varies with the size of the feature 
being looked for. For example, if the goalpost in the 
reference level 3 image spans half the image, i.e., is 45 
pixels wide, the allowed range of magnification for the 
search to be successful is +/- 18% of the reference image. 
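
The count of 12 passes for the exhaustive level 3 search can be checked with a short calculation. The sketch assumes the 15 by 15 regions tile the 90 by 30 image without overlap, which the text does not state explicitly; the function name is illustrative.

```python
import math


def correlation_passes(width: int, height: int, region: int = 15) -> int:
    """Number of region x region correlation passes needed to tile an image,
    assuming non-overlapping regions."""
    return math.ceil(width / region) * math.ceil(height / region)


passes = correlation_passes(90, 30)        # 6 regions across, 2 regions down
search_ms = passes * 1.0                   # about 1 millisecond per pass
field_ms = 1000.0 / 60.0                   # one NTSC field, about 16.6 ms
fits_in_one_field = search_ms < field_ms   # the first node fits in a field time

print(passes, search_ms, round(field_ms, 1), fits_in_one_field)
```

Under these assumptions the exhaustive first-node search uses roughly three quarters of a field time, leaving only a handful of correlations for the remaining nodes if the whole search is to finish within one or two fields.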



If at the predicted location, a correlation of 
the left hand reference template 38 produces a peak value 
greater than a predetermined threshold, then a matching 
left hand corner is assumed to have been found. Based on 
the position indicated by the location of the maximum 
correlation, the search algorithm updates the estimate of 
the magnification of the incoming image with respect to the 
reference image. The search then proceeds to the next 
search template 40, which in this example is the level 2 
right hand corner, and runs a correlation using the 
position of the right hand corner predicted by the level 3 
search. The maximum of this correlation is assumed to be 
a more accurate position for the right hand corner of the 
goalpost. Using the magnification of the incoming image 
with respect to the reference image calculated in the level 
three part of the search, the algorithm then looks for the 
right hand corner of the goal post with the right hand goal 
post reference template 44. As before, the peak 
correlation position, if above an acceptable threshold, is 
assumed to be a more accurate position of the right hand 
post in the incoming image. The procedure may be repeated 
one more time using level one templates 46 and 48 taken 
from the level 1 reference pyramid image 50 for greater 
positional accuracy. The total search of this example 
takes 17 correlations, or just over one field, with a 
magnification tolerance of roughly +/- 20%. However, 
the simplicity of the search tree and its having only an 
experimentally determined threshold to indicate whether or 
not a match between template and image really indicates the 
location of a feature, would result in an unacceptably 
large number of false positive matches. 

By contrast, a typical search tree employed by 
the method of the preferred embodiment of this invention, 
i.e., unlike feature correlation, is shown in Figure 2c. 
The target of the search is the same as in the previous 
example. The method of unlike feature correlation starts 
in a similar manner, using a level 3 pyramid of the right 
hand corner 52, and does an exhaustive search of the level 
3 pyramid of the incoming pyramid image in 12 correlations 
of 15 by 15. As before, it initially assumes that the maximum 
peak of all 12 correlations indicates a right hand corner. 
The search then takes a level 1 right hand corner template 
54 and does a correlation of this pattern on the level 1 
pyramid of the incoming image in order to get an accurate 
position of the right hand corner. The next step is to do 
an unlike feature correlation of a vertical line template 
56 centered on the suspected corner. This is done to 
verify that the corner is indeed a corner and not some 
similar geometric figure that would give a similar 
correlation. 

To understand the power of the method of unlike 
correlation in distinguishing between similar geometrical 
objects, it is first necessary to consider how conventional 
techniques of identifying a pattern by correlation with a 
like pattern can lead to errors. Figure 3 shows how a 
reference template of a right hand corner 58, when 
correlated against a right hand corner 60, gives correlation 
surface 62, which is in practice indistinguishable from the 
correlation surface 66 resulting from the correlation of 
the right hand corner template 58 against a cross 64. 
Similarly, the correlation of the right hand corner 
template 58 against a T feature 68 results in a correlation 
surface 70, indistinguishable in a real, noisy environment 
from correlation surface 62. Even the correlation of the 
right hand corner template 58 against a left hand corner 
image 72 results in a correlation surface 74, which will, 
in general, have a lower maximum than the correlation 
surface 62, but is so similar in structure that in a noisy 
practical environment it may easily be mistaken for the 
correlation surface 62. As a final example the result of 
correlating the right hand corner template 58 against a 
vertical T image 76 is shown in the correlation surface 78. 
Once again the differences between the correlation surfaces 
78 and 62 are subtle variations in values, easily confused 
by noise, variations in magnification or changes in 
intensity. 
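
The ambiguity of like-feature correlation described above can be reproduced on toy data. The sketch below is an illustration under stated assumptions, not the correlator of the preferred embodiment: features are 1-pixel-wide binary line drawings, the correlation is an unnormalised sum of products, and the shapes (a right hand corner drawn as a bar running left plus an upright running up, and so on) are one reading of the features in Figure 3.

```python
def feature(size, arms):
    """Binary size x size image of 1-pixel-wide arms meeting at the centre."""
    c = size // 2
    img = [[0] * size for _ in range(size)]
    img[c][c] = 1
    for i in range(1, c + 1):
        if "up" in arms:
            img[c - i][c] = 1
        if "down" in arms:
            img[c + i][c] = 1
        if "left" in arms:
            img[c][c - i] = 1
        if "right" in arms:
            img[c][c + i] = 1
    return img


def correlate(image, template):
    """Unnormalised cross-correlation at every offset where the template fits."""
    n, m = len(image), len(template)
    return [[sum(image[r + i][c + j] * template[i][j]
                 for i in range(m) for j in range(m))
             for c in range(n - m + 1)]
            for r in range(n - m + 1)]


def peak(surface):
    """Largest value anywhere on a correlation surface."""
    return max(max(row) for row in surface)


corner_template = feature(7, {"left", "up"})  # stands in for template 58
scenes = {
    "right hand corner": {"left", "up"},
    "cross": {"left", "right", "up", "down"},
    "inverted T": {"left", "right", "up"},
    "vertical T": {"left", "up", "down"},
    "left hand corner": {"right", "up"},
}
peaks = {name: peak(correlate(feature(15, arms), corner_template))
         for name, arms in scenes.items()}
print(peaks)
```

In this idealised setting the cross and both T shapes contain the corner, so they reach exactly the same peak as the true corner, while the left hand corner reaches a lower one. A peak threshold alone therefore cannot separate the first four.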

Columns 2 and 3 of Figure 3 show examples of 
unlike feature correlation. In these the correlation 
surfaces provide robust evidence to verify that the object 
is a right hand corner or reject it as being some other 
geometrical shape. 



WO 96/24115    PCT/US96/01125 

In column 2, the horizontal bar template 80 is 
correlated against the right hand corner 60, 
resulting in the correlation surface 82; an intensity map 
of the central row of the surface 82 is also shown at 84. 
This correlation surface 82 has the distinguishing features 
that the maximum intensity occurs at the extreme left hand 
corner and that the intensity of the central row falls off 
sharply, in a predictable staircase as shown in 84. By 
contrast, the correlation surface 86 obtained by 
correlating the horizontal bar template 80 against the 
cross feature 64 is a single bar of near uniform intensity, 
as seen from the intensity map of the peak row 88. This 
correlation surface is readily distinguishable from the 
required correlation surface 82. Similarly, the 
correlation of the horizontal bar 80 against a horizontal, 
inverted T 68 and a left hand corner 72 result in the 
correlation surfaces 90 and 94, with corresponding 
intensity maps of the peak row 92 and 96. Both correlation 
surfaces 90 and 94 are readily distinguishable from the 
correct surface 82. Only in the case of the correlation of 
the horizontal bar 80 against a vertical T 76 are the 
resultant correlation surface 98 and peak row intensity map 
100 virtually indistinguishable from the correct 
correlation surface 82. However, in this case the 
correlation of a vertical bar 102 shown in the right hand 
corner can be used in conjunction with the horizontal bar. 
This can be seen by comparing the correlation surface 104 
obtained by correlating the vertical bar 102 against a 
right hand corner 60, with the very different correlation 
surface 108 obtained from correlating the vertical bar 102 
against the vertical T 76. The correlation surfaces 108 
and 104 are readily distinguishable. 
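
The staircase and uniform-bar profiles described above can be reproduced on toy data. As before, this is only a sketch under assumptions: binary 1-pixel-wide features, unnormalised correlation, and shapes chosen as one reading of Figure 3. The helper names are illustrative.

```python
def feature(size, arms):
    """Binary size x size image of 1-pixel-wide arms meeting at the centre."""
    c = size // 2
    img = [[0] * size for _ in range(size)]
    img[c][c] = 1
    for i in range(1, c + 1):
        if "up" in arms:
            img[c - i][c] = 1
        if "down" in arms:
            img[c + i][c] = 1
        if "left" in arms:
            img[c][c - i] = 1
        if "right" in arms:
            img[c][c + i] = 1
    return img


def correlate(image, template):
    """Unnormalised cross-correlation at every offset where the template fits."""
    n, m = len(image), len(template)
    return [[sum(image[r + i][c + j] * template[i][j]
                 for i in range(m) for j in range(m))
             for c in range(n - m + 1)]
            for r in range(n - m + 1)]


def centre_row(surface):
    """Profile along the central row of a correlation surface."""
    return surface[len(surface) // 2]


def centre_column(surface):
    """Profile down the central column of a correlation surface."""
    mid = len(surface[0]) // 2
    return [row[mid] for row in surface]


horizontal_bar = feature(7, {"left", "right"})  # stands in for template 80
vertical_bar = feature(7, {"up", "down"})       # stands in for template 102

corner = feature(15, {"left", "up"})
cross = feature(15, {"left", "right", "up", "down"})
vertical_t = feature(15, {"left", "up", "down"})

row_corner = centre_row(correlate(corner, horizontal_bar))
row_cross = centre_row(correlate(cross, horizontal_bar))
row_vertical_t = centre_row(correlate(vertical_t, horizontal_bar))
col_corner = centre_column(correlate(corner, vertical_bar))
col_vertical_t = centre_column(correlate(vertical_t, vertical_bar))
```

Here the horizontal bar gives a staircase for the corner and a flat profile for the cross, but cannot tell the corner from the vertical T. The vertical bar resolves that remaining case, because its profile is a staircase for the corner and flat for the vertical T.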

From Figure 3 it is clear that there are 
ambiguities inherent in the correlation process in which a 
reference template of the object being sought is 
correlated over an image, or portion of an image, looking 
for the best match. These ambiguities stem from the fact 
that related geometrical elements can lead to false peaks 
with very similar correlation surfaces. In addition, it is 
clear from Figure 3 that these ambiguities can be resolved 
in a practical way by further correlations of suitably 
chosen reference templates, deliberately unlike the object 
being sought, alone or in combination with each other. The 
key feature is that the unlike templates lead to 
correlation surfaces that differ not merely in subtle 
changes of peak intensity, but have predictable geometric 
structures which are markedly different even for related 
geometrical structures. 

The search tree of the unlike correlation method 
of the invention locates a candidate right hand corner by 
correlation of the level 3 right hand corner template 52 on 
the level 3 incoming image. The position of the most 
likely candidate is chosen, and its location defined more 
accurately by a level 1 correlation using the level 1 right 
hand template 54. A check is then done to see if the 
candidate really is a right hand corner by first doing a 
level one unlike correlation using a vertical line 56, and 
checking both that the peak of that correlation occurs in 
one of the upper three positions of the peak row of the 
correlation surface, as shown in 104, and that the 
intensity of the peak row falls off rapidly as shown in 106. 
If the candidate corner does not match, it is possible to 
go back to the second most likely candidate in the initial 
level 3 search, and investigate that. If the candidate 
does pass the vertical bar test, it can be further 
investigated by the level 1 horizontal bar reference 
template 112. The correlation surface should now 
correspond to that shown in 82 and the intensity of the 
peak row to that in 84. From these two tests it is now 
evident that we have located a right hand corner. The left 
hand corner can then be sought using the reference template 
112, and doing the correlation one step to the left of the 
position of the right hand corner. This stepping allows us 
to simultaneously check for continuity of the horizontal 
bar and to determine that when we do reach the end, the 
peak line of the correlation surface falls off in the 
appropriate staircase fashion indicated by the correlation 
surface 94 in Figure 3. A final verification to ensure 
that the search has arrived at a left hand corner would be 
to run a correlation of a level 1 vertical bar reference 
template 114 and check against the expected correlation 
surface 116 with a corresponding map of peak row intensity 
118. 
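
The two checks applied to each unlike correlation (peak within the first three positions, then a rapid fall-off) and the fallback to the next candidate can be sketched as follows. The threshold `falloff_fraction`, the function names and the `profiles_for` callable are assumptions introduced for illustration; the text gives no numerical criteria.

```python
def passes_line_end_test(profile, max_peak_index=2, falloff_fraction=0.25):
    """Return True if a peak-row profile looks like the end of a line:
    the peak lies within the first max_peak_index + 1 positions and the
    values then decrease steadily to a small fraction of the peak."""
    peak_value = max(profile)
    if peak_value <= 0:
        return False
    peak_index = profile.index(peak_value)
    if peak_index > max_peak_index:
        return False
    tail = profile[peak_index:]
    non_increasing = all(a >= b for a, b in zip(tail, tail[1:]))
    return non_increasing and tail[-1] <= falloff_fraction * peak_value


def first_verified(candidates, profiles_for):
    """Return the first candidate (in order of decreasing level 3 correlation)
    whose unlike-correlation profiles all pass the test, or None."""
    for candidate in candidates:
        if all(passes_line_end_test(p) for p in profiles_for(candidate)):
            return candidate
    return None


staircase = [7, 7, 6, 5, 4, 3, 2, 1, 0]   # profile expected at a true corner
uniform = [7] * 9                          # profile from a line that continues

# Hypothetical candidates: "A" is really a cross, "B" is a true corner.
toy_profiles = {"A": [uniform, uniform], "B": [staircase, staircase]}
chosen = first_verified(["A", "B"], lambda name: toy_profiles[name])
```

Rejecting candidate "A" costs only two correlations, after which the second most likely candidate is examined without repeating the exhaustive level 3 search.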



WO 96/24115    PCT/US96/01125



The total search indicated would take about 19 correlations, assuming
the stepping along the bar takes 4 level 1 correlations, covering 60
pixels or 0.16 of the image. (The stepping may be done at either level
2 or level 3 to increase the span for a given number of correlations.)
There are obvious extensions to the use of unlike feature
correlations, such as checking for ends of lines. The principal
advantage of the unlike correlation method is that, by doing strict
element analysis early on, it not only drastically reduces the number
of false alarms, but does so early enough to allow lesser candidates
to be considered without wasting correlations re-searching the entire
image.
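
The candidate ordering and fall-back described above can be sketched as a short loop. This is only an illustrative sketch under stated assumptions, not the implementation of the invention: the names `first_verified_candidate`, `vertical_bar_test` and `horizontal_bar_test` are hypothetical, and each check is assumed to be a function that runs one unlike correlation at a candidate position and reports whether the resulting surface has the expected shape.

```python
from typing import Callable, Iterable, Optional, Tuple

Position = Tuple[int, int]
Candidate = Tuple[float, Position]   # (coarse correlation score, position)
Check = Callable[[Position], bool]   # one unlike-correlation test (assumed interface)


def first_verified_candidate(
    candidates: Iterable[Candidate], checks: Iterable[Check]
) -> Optional[Position]:
    """Return the best-scoring candidate that passes every check, in order.

    Candidates are tried from most to least likely. A candidate is dropped
    at its first failed check, so later checks are never run on it.
    """
    ordered_checks = list(checks)
    for _score, position in sorted(candidates, key=lambda c: c[0], reverse=True):
        if all(check(position) for check in ordered_checks):
            return position
    return None


# Hypothetical stand-ins for the vertical bar and horizontal bar tests.
calls = []


def vertical_bar_test(position: Position) -> bool:
    calls.append(("vertical", position))
    return position != (10, 10)      # pretend the top candidate is a false alarm


def horizontal_bar_test(position: Position) -> bool:
    calls.append(("horizontal", position))
    return True


candidates = [(0.9, (10, 10)), (0.8, (40, 12)), (0.5, (70, 30))]
corner = first_verified_candidate(candidates, [vertical_bar_test, horizontal_bar_test])
print(corner, len(calls))
```

Because a failed candidate is abandoned at its first failed test, the false alarm in this example costs one extra correlation rather than a fresh search of the whole image.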

Having done a rapid search, the next phase of the conventional live
video insertion system is to do a verification by doing a larger
number of correlations (typically between 10 and 15 with existing
Datacube Max20 hardware) on the level 1 pyramid of the incoming image
to further check that the object identified is in fact the object
being sought. As illustrated in Figure 4a, these correlations are
centered on points 120 that model the target in the incoming image
using the pose (translation, magnification and rotation) identified by
the search. Although the verification does cut down on the number of
false alarms, it has a problem shown in figure 4b. Here the target is
not in view, but because each of the correlations, though centered on
the model 120, extends over a range of 15 by 15 pixels 122, a random
collection of arbitrary objects 126 (players, for example) can give
rise to correlations whose peak value 124 is above the required
experimental threshold (indicating an adequate match). The result is
that the algorithm treats the random collection of arbitrary objects
as being equivalent to having verified the existence of the goalpost
128. This effect can be reduced by reducing the area 122 over which
the correlations are done or, even more effectively, by first doing
sub-pixel interpolation on the correlation surfaces to determine the
exact coordinates of the matches 124, and then checking that these
coordinates are aligned to each other to fit a model of the search
object (i.e. goalpost 128) to sub-pixel accuracy. This geometric check
of the verification stage is an important adjunct to the unlike
correlation checks. Combined, they allow the simple correlation search
strategies detailed above to achieve rapid (less than 1/30th of a
second) positive recognition, with false turn-on rates of less than 1
in 1 million attempts, which is about two orders of magnitude better
than the same fast correlation techniques without the two levels of
checking.
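
The geometric check can be illustrated with a minimal sketch. It assumes the pose is given as a translation, a magnification and a rotation, that the model is a list of points in its own coordinate frame, and that a match is accepted when every measured sub-pixel coordinate lies within a tolerance of its predicted position. The function names and the 0.5 pixel tolerance are hypothetical choices for illustration, not values from the text.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def project(model_points: Sequence[Point], tx: float, ty: float,
            magnification: float, rotation_rad: float) -> List[Point]:
    """Place model points in the image using translation, zoom and rotation."""
    c, s = math.cos(rotation_rad), math.sin(rotation_rad)
    return [(tx + magnification * (c * x - s * y),
             ty + magnification * (s * x + c * y)) for x, y in model_points]


def geometric_check(model_points: Sequence[Point], matches: Sequence[Point],
                    pose: Tuple[float, float, float, float],
                    tolerance: float = 0.5) -> bool:
    """True if every measured match lies within `tolerance` pixels of the model.

    `tolerance` is an assumed value; a real system would set it from the
    accuracy of its sub-pixel interpolation.
    """
    if len(model_points) != len(matches):
        return False
    predicted = project(model_points, *pose)
    return all(math.hypot(px - mx, py - my) <= tolerance
               for (px, py), (mx, my) in zip(predicted, matches))


model = [(0.0, 0.0), (10.0, 0.0), (0.0, 20.0)]
pose = (100.0, 50.0, 2.0, 0.0)          # tx, ty, magnification, rotation
aligned = [(100.1, 50.0), (119.9, 50.2), (100.0, 89.8)]
scattered = [(100.0, 50.0), (131.0, 47.0), (96.0, 80.0)]
print(geometric_check(model, aligned, pose), geometric_check(model, scattered, pose))
```

Each `scattered` point could still lie inside a 15 by 15 pixel correlation window and exceed the correlation threshold on its own, yet the set as a whole does not fit the model, which is the situation of Figure 4b.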

Sub-pixel interpolation at a verification stage is known. It is an
infrequently used and poorly documented, but very effective,
technique, and is illustrated in Figure 5 for a one dimensional case.
The peak correlation value 132 and the two values on either side of
it, 130 and 134, are considered. The first step is to determine which
is the smaller of the two values on either side of the peak 132, which
in Figure 5 is 130. A line 136 is then drawn through the peak 132 and
the lesser value 130 to obtain the angle 138. A second line 140 is
then drawn through the higher value 134 at an angle to the horizontal
which is 180 degrees minus the angle 138. The intersection 142 of the
two lines 136 and 140 gives the sub-pixel value 144 of the
correlation. This method of reconstructing a triangle has some
theoretical justification in that the correlation of a rectangle
function with itself is a triangle. The method has also been found to
be the most consistent and accurate experimentally and can readily be
extended to the two dimensional case. This sub-pixel technique, though
occasionally used in other contexts, lends itself well to usage with
the novel aspects of this invention.
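
The one dimensional construction can be written out as a short function. This is a sketch under stated assumptions: the three samples are taken at positions -1, 0 and +1, the middle sample is the largest, and the name `triangle_subpixel_peak` is hypothetical. With slope k = peak - lesser neighbor, the two lines of equal and opposite slope intersect at an offset of (right - left) / (2k) from the middle sample.

```python
from typing import Tuple


def triangle_subpixel_peak(left: float, peak: float, right: float) -> Tuple[float, float]:
    """Return (offset, value) of the reconstructed triangle apex.

    `offset` is measured in samples from the middle sample, positive toward
    `right`. The first line passes through the peak and the lesser neighbor;
    the second passes through the greater neighbor with the opposite slope.
    """
    lesser = min(left, right)
    slope = peak - lesser                 # rise per sample of the first line
    if slope <= 0:
        return 0.0, float(peak)           # flat surface: nothing to refine
    offset = (right - left) / (2.0 * slope)
    value = peak + slope * abs(offset)    # height where the two lines meet
    return offset, value


print(triangle_subpixel_peak(0.0, 1.0, 0.5))
```

For a true triangle of slope 1 with its apex at +0.25 and height 1.25, the samples at -1, 0 and +1 are 0, 1 and 0.5, and the function recovers both the apex position and its height from those three values.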






The search strategy detailed above, with the unlike correlation checks
and the final geometrical check, provides a means of getting the
positive search and the false search patterns down to an acceptable
minimum. The final part of being able to implement such searches in
practice and know that they will perform as required is to have
appropriate tools for checking them effectively.

The false alarm case can be handled most readily by having a simple
logging function incorporated in the software as shown in the flow
chart in Figure 6.

This logging function, when activated, stores the first image the
machine sees each time it turns on. This may be done by directly
downloading one of the delay memories 150 in Figure 1, or by first
copying that image to a special memory surface 152, which is not in
the stream of continuous video, and then downloading that memory in
non-real time to the control computer 153. The system can then be run
unattended for extended periods of time (e.g. overnight, which is
roughly 2 million fields of video) and effectively watch a random
video stream. At the end of the period, e.g. the next morning, an
operator can then see how many images the machine recognized, and
which ones they were. This not only allows the false turn-on rate to
be calculated, but gives the operator insight into what caused the
false turn-ons, allowing the operator to take corrective action by
altering the search strategy if necessary.
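
A minimal sketch of such a log is given below. The class name `TurnOnLog`, the string image identifiers and the rate of 60 fields per second are assumptions made for illustration; the text specifies only that the first image at each turn-on is stored and that an overnight run is roughly 2 million fields.

```python
from dataclasses import dataclass, field
from typing import List


def fields_in_period(hours: float, fields_per_second: int = 60) -> int:
    """Number of video fields in a run, assuming a nominal field rate."""
    return int(hours * 3600 * fields_per_second)


@dataclass
class TurnOnLog:
    """Stores the first image seen at each turn-on and counts fields watched."""
    fields_seen: int = 0
    first_images: List[str] = field(default_factory=list)

    def watch(self, image_id: str, turned_on: bool) -> None:
        self.fields_seen += 1
        if turned_on:
            self.first_images.append(image_id)   # kept for later inspection

    def false_turn_on_rate(self) -> float:
        if self.fields_seen == 0:
            return 0.0
        return len(self.first_images) / self.fields_seen


log = TurnOnLog()
for i in range(1000):
    # Pretend the machine falsely turned on at three fields of random video.
    log.watch(f"field-{i}", turned_on=i in (17, 402, 980))

print(log.first_images, log.false_turn_on_rate(), fields_in_period(10))
```

The stored identifiers are what let an operator look at the offending images the next morning, rather than only reading off a rate.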

The positive turn-on testing requires a more subtle tool, also
indicated in the flow diagram in figure 6. Because the testing is
typically done from a limited set of tape recordings of a prior event
in the same stadium, and because once the search has been successfully
completed the live video insertion machine 10 switches into a tracking
mode, in which the tracking comparator 154 compares a set of tracking
templates 156 with the incoming scene, bypassing the search comparator
30, the operator would typically only have a limited number of
transition scenes at which the machine does its recognition. However,
by incorporating a flag that allows the machine to always fail after
one insertion, i.e. after the logo stored in memory 158 has been
combined with the actual, delayed video 150 using the warp, mask,
occlude and insert module 160 to give the final composite output 162,
the machine can be required to attempt to recognize every video field
in a given sequence. This always fail flag allows an operator to very
quickly see, even on a limited tape sample of a prior game, if the
current search strategy would be applicable over any of the camera
angles and magnifications that might be an initial field in some
future game.
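
The effect of the always fail flag can be sketched as a two-state loop. This is an illustrative model only: `run_machine`, `recognize` and `still_tracking` are hypothetical names, and the search and tracking comparators are reduced to functions that return true or false for a field.

```python
from typing import Callable, Iterable, Tuple


def run_machine(fields: Iterable[int],
                recognize: Callable[[int], bool],
                still_tracking: Callable[[int], bool],
                always_fail: bool = False) -> Tuple[int, int]:
    """Return (search_attempts, insertions) over a sequence of fields.

    Without the flag, a successful search switches the machine to tracking,
    which bypasses the search. With the flag, the machine drops back to
    search after a single insertion, so every field is searched.
    """
    searching = True
    attempts = insertions = 0
    for f in fields:
        if searching:
            attempts += 1
            if not recognize(f):
                continue
            insertions += 1
            searching = always_fail       # flag set: fail after one insertion
        elif still_tracking(f):
            insertions += 1
        else:
            searching = True              # tracking lost: search again next field
    return attempts, insertions


normal = run_machine(range(10), lambda f: True, lambda f: True)
forced = run_machine(range(10), lambda f: True, lambda f: True, always_fail=True)
print(normal, forced)
```

In this toy run the normal mode exercises the search once in ten fields, while the flagged mode exercises it on all ten, which is what makes a short tape sample useful for testing.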



The unlike correlation method of the preferred embodiment illustrated
in Figs. 1 - 3 may be thought of as one particular, and efficient,
implementation of a more general method for invariant linear feature
detection by correlation. Invariant linear features are defined as
ones that in a two dimensional plane provide x and y information, but
whose appearance is independent of zoom, i.e., the features look the
same at any zoom.

Three classes of 2-dimensional linear invariant features are shown in
Figs. 7a - 7c.

The simplest invariant linear feature is a line 200, of any
orientation, that ends within the region of interest as shown in Fig.
7a.

The next most complex, and the most practically useful, set of
invariant linear features consists of two lines shown as 202 and 204
in Fig. 7b, of any orientation, though not identical, which either
meet or cross. As shown in Fig. 7b there are three cases of such
lines: either they meet at two endpoints as shown by feature 206 (L
shaped); or they meet at one end point as shown by feature 208 (T
shaped); or they intersect each other as shown by feature 210 (X
shaped).

Three or more lines such as lines 212, 214, 216 and 218 illustrated in
Fig. 7c are only zoom invariant if they meet or cross at a single
point as shown by feature 220. Such structures are less common in
video images.

The case of two orthogonal lines 302 and 304 illustrated in Fig. 8
meeting at a point as a reverse L shaped feature 300 will be discussed
as an example, though it will be clear that the generalized method can
be adapted for many sophisticated cases of zoom invariant linear
features.

A reversed L shaped feature 300 as shown in Fig. 8 can be detected
with a considerably reduced chance of confusion with related two line
structures by maximizing the function T:



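$$T = \operatorname{Sum}\Big[\operatorname{sum}(P(\mathrm{col}, 90, (n), (m+v))) - \operatorname{sum}(P(\mathrm{row}, 0, (n-h), (m))) + \operatorname{sum}(P(\mathrm{row}, 0, (n+h), (m))) - \operatorname{sum}(P(\mathrm{col}, 90, (n), (m-v)))\Big]$$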

where the term sum(P(col, 90, (n), (m))) is interpreted as being the
sum of the values of the peak column of the correlation of a line at
90 degrees to the horizontal (i.e., a vertical line template),
centered at the nth horizontal correlation position and the mth
vertical correlation position. Because the correlation of a line with
a line gives a correlation surface with a corresponding line of high
values, summing the peaks along the correlation surface provides a
first check on the existence of the line. Some degree of continuity is
also implicit in the summing along a peak row or column. However, if
the time or compute power is available, more discriminating go-nogo
checks may be included, such as checking on either side of the column
with the highest sum of correlations for either a column with negative
sum, or individual negative values. The variance of the individual
values in the peak column, or another statistical quantity, may also
be included, either as a go-nogo check or as a weighting factor in
slightly modified variants of the function T defined above.

Fig. 9 illustrates more specifically the correlation surfaces 400
generated when looking for the reverse L-shaped feature 300 of Fig. 8.
A first correlation search rectangle looks for row correlation and
generates a first pair of correlation surfaces 406 and 410. We know
that we have likely identified the horizontal row feature 304 of the
reverse L-shaped feature 300 when 406 is maximum and 410 is minimum.
Correlation surface 406 correlates with the horizontal line 304 and is
centered on correlation position (n - h, m), and correlation surface
410 is the resulting correlation surface centered on correlation
position (n + h, m). Likewise, the correlation search rectangle 404
produces two correlation surfaces 408 and 412 which look for column
feature 302 of reverse L-shaped feature 300 of Fig. 8. Correlation
surface 408 correlates with vertical line 302 and is centered on
correlation position (n, m + v), and correlation surface 412 is
centered on position (n, m - v). Feature 302 is detected when
correlation surface 408 is maximum and 412 is minimum. Therefore, the
total reverse L-shaped feature 300 is detected when the function sum
or total (T) is maximum, or

T = surface 406 - surface 410 + surface 408 - surface 412 = maximum
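
The four-surface score of Fig. 9 can be tried on a toy binary image. The sketch below makes several simplifying assumptions that are not in the text: the peak-line sum of each correlation surface is replaced by a plain sum of pixels along a short segment, m is counted upward from the bottom row, and all function names are hypothetical.

```python
def make_image(size):
    return [[0] * size for _ in range(size)]


def set_pixel(image, n, m, value=1):
    image[len(image) - 1 - m][n] = value       # m counts upward from the bottom


def get_pixel(image, n, m):
    return image[len(image) - 1 - m][n]


def row_sum(image, n_center, m, half):
    """Stand-in for the peak-row sum of a horizontal-line correlation."""
    return sum(get_pixel(image, n, m) for n in range(n_center - half, n_center + half + 1))


def col_sum(image, n, m_center, half):
    """Stand-in for the peak-column sum of a vertical-line correlation."""
    return sum(get_pixel(image, n, m) for m in range(m_center - half, m_center + half + 1))


def reverse_l_score(image, n, m, h, v, half):
    s406 = row_sum(image, n - h, m, half)      # horizontal line expected here
    s410 = row_sum(image, n + h, m, half)      # and absent here
    s408 = col_sum(image, n, m + v, half)      # vertical line expected here
    s412 = col_sum(image, n, m - v, half)      # and absent here
    return s406 - s410 + s408 - s412


reverse_l = make_image(21)                     # bar runs left, post runs up
plain_l = make_image(21)                       # bar runs right, post runs up
for k in range(9):
    set_pixel(reverse_l, 10 - k, 10)
    set_pixel(reverse_l, 10, 10 + k)
    set_pixel(plain_l, 10 + k, 10)
    set_pixel(plain_l, 10, 10 + k)

grid = [(n, m) for n in range(7, 14) for m in range(7, 14)]
best = max(grid, key=lambda p: reverse_l_score(reverse_l, p[0], p[1], 4, 4, 3))
print(best, reverse_l_score(reverse_l, 10, 10, 4, 4, 3), reverse_l_score(plain_l, 10, 10, 4, 4, 3))
```

In this toy case the score peaks at the corner of the reverse L, while the mirrored L scores zero at the same position because its horizontal bar lands in the subtracted term.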



This can be further generalized for the two orthogonal lines case
(Fig. 7b) as follows:

$$T = \sum \mathrm{row}_1 \pm \sum \mathrm{row}_2 + \sum \mathrm{col}_1 \pm \sum \mathrm{col}_2 = \text{maximum}$$

If the two lines are straight but not orthogonal this can be
generalized as follows:

$$T = \sum \left( \sum \mathrm{row}_{\alpha_1} \pm \sum \mathrm{row}_{\alpha_1 \pm 180^\circ} + \sum \mathrm{col}_{\alpha_2} \pm \sum \mathrm{col}_{\alpha_2 \pm 180^\circ} \right) = \text{maximum}$$

where

α_1 = angle of line 1 with respect to abscissa (x axis)
α_2 = angle of line 2 with respect to abscissa (x axis)

For a multiline system of straight lines converging at a point with
angles of intersection of α_1, α_2 ... α_∞ the generalized function T
can be expressed as

$$T_{i = 1 \text{ to } \infty} = \sum \left( \sum P_{\alpha_i}(m, n) \pm \sum P_{\alpha_i}(m+v, n+h) \right) = \text{maximum}$$

where:

v = vertical distance between m and m + v
h = horizontal distance between n and n + h

For practical purposes there are generally going to be at least two
straight lines converging at a point so

$$T_{i = 2 \text{ to } \infty} = \sum \left( \sum P_{\alpha_i}(m, n) \pm \sum P_{\alpha_i}(m+v, n+h) \right) = \text{maximum}$$

where i = 2 or greater

While the invention has been described with reference to a preferred
embodiment, it will be appreciated by those of ordinary skill in the
art that changes can be made to the structure and operation of the
invention without departing from the spirit and structure of the
invention as a whole.
WE CLAIM:

1. A pattern recognition method for recognizing an object having
distinctive features as imaged in a video field, said method
comprising the steps of:

(a) passing a first template having a first pattern similar to one of
said distinctive features over said video field and comparing the same
in order to preliminarily identify at least one possible distinctive
feature of said object which could be either a correctly identified
distinctive feature or one of a plurality of incorrectly identified
features; and,

(b) passing a second template having a second pattern different from
said first template pattern over said possible distinctive feature and
comparing the same in order to determine if said possible distinctive
feature is at least one of said incorrectly identified features.

2. The method of claim 1 further comprising the step of:

(c) passing a third template having a third pattern different from
said first and second template patterns over said possible distinctive
feature and comparing the same in order to determine if said possible
distinctive feature is at least another of said incorrectly identified
features.

3. The method of claim 2 further comprising the step of:

(d) repeating steps (b) and (c) above until all possible likely
falsely identified features have been eliminated as possible
candidates for said distinctive feature,

wherein said correctly identified feature is accurately identified by
process of elimination.
4. The method of claim 3 comprising the step of:

(e) selecting said second pattern from a group comprising a first
element of said distinctive feature and extending it completely across
said second template.

5. The method of claim 4 further comprising the step of:

(f) selecting said third pattern from a group comprising a second
element of said distinctive feature and extending it completely across
said third template.

6. The method of claim 5 further comprising the steps of:

(g) repeating steps (a) - (f) above to locate at least two potential
distinctive features of said object; and,

(h) geometrically comparing the location of said at least two
potential distinctive features of said object against a geometric
model of said object to further determine if said object has been
correctly identified.

7. The method of claim 6 further comprising the step of:

(i) recording a series of consecutive video frames and verifying the
accuracy of said method for recognizing an object by testing it
against said series of video frames over a period of time to determine
if it has correctly identified said object.

8. The method of claim 7 further comprising the step of:

(j) determining the sub-pixel maximum value of at least a first,
second and third pixel each having a top with a midpoint and all
having a common baseline by forming a first line through the midpoint
of the top of said first and second pixels, said first line forming an
angle α with respect to said common baseline and subsequently drawing
a second line through the top midpoint of said third pixel, said
second line having an angle 180° - α with respect to said common
baseline,

wherein the triangular intersection of said first and second lines
approximates the location of the maximum value of said three pixels.

9. The method of claim 8 further comprising the step of:

(k) inserting an always fail flag into the end of each frame tested in
step (i) thereby forcing the method to repeat steps (a) - (f) at least
once per field,

wherein an effort is made to recognize said object once each frame.

10. The method of claim 9 further comprising the step of:

(l) inserting an alternative image in place of said object in said
video field.

11. The method of claim 2 wherein at least two templates are employed
and a distinctive feature is determined to exist when the following
total function T is maximized:

$$T = \sum \left( \sum P_{\alpha_i}(m, n) \pm \sum P_{\alpha_i}(m+v, n+h) \right) = \text{maximum}$$

where T = Total or Sum of function
P = a given line
α_i = angle of a given line P to abscissa
i = 2 or more
m = location of at least a first search template on the vertical,
i.e., ordinate, axis
n = location of at least a first search template on the horizontal,
i.e., abscissa, axis
v = vertical offset of at least a second search template from point m
h = horizontal offset of at least a second search template from
point n.

12. The method of claim 11 wherein said distinctive feature is located
at the intersection of two orthogonal straight lines by maximizing the
function:

$$T = \sum \Big[ +\sum P(\mathrm{col}, 90, (n), (m+v)) - \sum P(\mathrm{row}, 0, (n-h), (m)) + \sum P(\mathrm{row}, 0, (n+h), (m)) - \sum P(\mathrm{col}, 90, (n), (m-v)) \Big] = \text{maximum}$$

13. A pattern recognition method for recognizing an object having
landmark features as imaged in a video field, said method comprising
the steps of:

(a) correlating a first template having a first pattern similar to one
of said landmark features with respect to said video field and
generating a first correlation surface to preliminarily identify at
least one candidate landmark feature of said object which could be
either a correctly identified landmark feature or one of a plurality
of falsely identified landmark features; and,

(b) correlating a second template having a second pattern with unlike
feature correlation with respect to said first template pattern to
said candidate landmark feature and generating a second correlation
surface to determine if said candidate landmark feature is at least
one of said falsely identified landmark features.

14. The method of claim 13 further comprising the step of:

(c) correlating a third template having a third pattern with unlike
feature correlation with respect to said first and second template
patterns over said candidate landmark feature and generating a third
correlation surface in order to determine if said candidate landmark
feature is at least another of said falsely identified landmark
features.

15. The method of claim 14 further comprising the step of:

(d) repeating steps (b) and (c) above until all possible likely
falsely identified landmark features have been eliminated,

wherein said correctly identified landmark feature is accurately
identified by process of elimination of said falsely identified
landmark features.
16. The method of claim 15 further comprising the step of:

(e) selecting said second pattern from a group comprising a first
element of said landmark feature and extending it completely across
said second template.

17. The method of claim 16 further comprising the step of:

(f) selecting said third pattern from a group comprising a second
element of said landmark feature and extending it completely across
said third template.

18. The method of claim 17 further comprising the step of:

(g) repeating steps (a) - (f) above to locate at least two potential
landmark features of said object; and,

(h) geometrically comparing the location of said at least two
potential landmark features of said object against a geometric model
of said object to further determine if said object has been correctly
identified.
19. The method of claim 18 further comprising the step of:

(i) recording a series of consecutive video frames and verifying the
accuracy of said method for recognizing an object by testing it
against said series of consecutive video frames over a period of time
to determine if it has correctly identified said object.

20. The method of claim 19 further comprising the step of:

(j) determining the sub-pixel maximum value of at least a first,
second and third pixel each having a top with a midpoint and each
having a common baseline by forming a first line through the midpoint
of the top of said first and second pixels, said first line forming an
angle α with respect to said common baseline and subsequently drawing
a second line through the top of the midpoint of said third pixel and
having an angle 180° - α with respect to said common baseline,

wherein the triangular intersection of said first and second lines
approximates the location of the maximum value of said three pixels.

21. The method of claim 20 further comprising the step of:

(k) inserting an always fail flag into the end of each frame tested in
step (i) above thereby forcing the method to repeat steps (a) - (f) at
least once per frame,

wherein an effort is made to recognize said object once each field.

22. The method of claim 21 further comprising the step of:

(l) inserting an alternative image in place of said object in said
video field.
23. The method of claim 14 wherein at least two correlation surfaces
are generated and a landmark feature is determined to exist when the
following total function is maximized:

$$T = \sum \left( \sum P_{\alpha_i}(m, n) \pm \sum P_{\alpha_i}(m+v, n+h) \right) = \text{maximum}$$

where T = Total or Sum of function
P = a given line
α_i = angle of a given line P to abscissa
i = 2 or more
m = location of at least a first search template on the vertical,
i.e., ordinate, axis
n = location of at least a first search template on the horizontal,
i.e., abscissa, axis
v = vertical offset of at least a second search template from point m
h = horizontal offset of at least a second search template from
point n.
24. The method of claim 23 wherein said landmark feature is located at
the intersection of two orthogonal straight lines by maximizing the
function:

$$T = \sum \Big[ +\sum P(\mathrm{col}, 90, (n), (m+v)) - \sum P(\mathrm{row}, 0, (n-h), (m)) + \sum P(\mathrm{row}, 0, (n+h), (m)) - \sum P(\mathrm{col}, 90, (n), (m-v)) \Big] = \text{maximum}$$

25. A system for recognizing an object having landmark features as
imaged in a video field, said system comprising:

scanning means for scanning said object and forming a series of video
fields; and,

correlating means for correlating a first template having a first
pattern similar to one of said landmark features with respect to said
video field and generating a first correlation surface to
preliminarily identify at least one candidate landmark feature of said
object which could be either a correctly identified landmark feature
or one of a plurality of falsely identified landmark features and for
correlating a second template having a second pattern with unlike
feature correlation with respect to said first template pattern to
said candidate landmark feature and generating a second correlation
surface to determine if said candidate landmark feature is at least
one of said falsely identified landmark features,

wherein said correctly identified landmark feature is identified by
process of eliminating possible falsely identified landmark features.

26. The system of claim 25 further comprising:

insertion means for inserting an alternative image in place of said
object in said video field.
AMENDED CLAIMS

[received by the International Bureau on 12 June 1996 (12.06.96);
original claims 1-26 replaced by amended claims 1-27 (11 pages)]

WE CLAIM:

1. A pattern recognition method for recognizing an object having distinctive
features as imaged in a video field, said method comprising the steps of:

(a) passing a first template having a first pattern similar to one of said
distinctive features over said video field and comparing the same in order to preliminarily
identify at least one possible distinctive feature of said object which could either be a
correctly identified distinctive feature or one of a plurality of incorrectly identified features;

(b) passing a second template having a second pattern different from said
first template pattern over said possible distinctive feature and comparing the same in order
to determine if said possible distinctive feature is at least one of said incorrectly identified
features;

(c) passing a third template having a third pattern different from said first
and second template patterns over said possible distinctive feature and comparing the same
in order to determine if said possible distinctive feature is at least another of said incorrectly
identified features;

(d) repeating steps (b) and (c) above until all possible likely falsely identified
features have been eliminated as possible candidates for said distinctive feature;

(e) selecting said second pattern, wherein said second pattern includes a
first element of said distinctive feature which extends substantially completely across said
second template,

wherein said correctly identified feature is accurately identified by process of
elimination.

2. The method of claim 1 further comprising the step of:

(f) selecting said third pattern, wherein said third pattern includes a second
element of said distinctive feature which extends substantially completely across said third
template.
3. The method of claim 2 further comprising the steps of:

(g) repeating steps (a) - (f) above to locate at least two potential distinctive
features of said object; and

(h) geometrically comparing the location of said at least two potential
distinctive features of said object against a geometric model of said object to further
determine if said object has been correctly identified.

4. The method of claim 3 further comprising the step of:

(i) recording a series of consecutive video frames and verifying the
accuracy of said method for recognizing an object by testing it against said series of video
frames over a period of time to determine if it has correctly identified said object.

5. The method of claim 4 further comprising the step of:

(j) determining the sub-pixel maximum value of at least a first, second and
third pixel each having a top with a midpoint and all having a common baseline by forming a
first line through the midpoint of the top of said first and second pixels, said first line forming
an angle α with respect to said common baseline and subsequently drawing a second line
through the top midpoint of said third pixel, said second line having an angle 180° - α with
respect to said common baseline,

wherein the triangular intersection of said first and second lines
approximates the location of the maximum value of said three pixels.

6. The method of claim 5 further comprising the step of:

(k) inserting an always fail flag into the end of each frame tested in step (i)
thereby forcing the method to repeat steps (a) - (f) at least once per field,

wherein an effort is made to recognize said object once each frame.

7. The method of claim 6 further comprising the step of:

(l) inserting an alternative image in place of said object in said video field.

8. The method of claim 1 wherein at least two templates are employed and a
distinctive feature is determined to exist when the following total function T is maximized:

$$T = \sum \left[ \sum P_{\alpha_i}(m, n) \pm \sum P_{\alpha_i}(m+v, n+h) \right] = \text{maximum}$$

where T = Total or Sum of function
P = a given line
α_i = angle of a given line P to abscissa
i = 2 or more
m = location of at least a first search template on the vertical, i.e., ordinate, axis
n = location of at least a first search template on the horizontal, i.e., abscissa, axis
v = vertical offset of at least a second search template from point m
h = horizontal offset of at least a second search template from point n.

9. The method of claim 8 wherein said distinctive feature is located at the intersection
of two orthogonal straight lines by maximizing the function:

$$T = \sum \Big[ +\sum P(\mathrm{col}, 90, (n), (m+v)) - \sum P(\mathrm{row}, 0, (n-h), (m)) + \sum P(\mathrm{row}, 0, (n+h), (m)) - \sum P(\mathrm{col}, 90, (n), (m-v)) \Big] = \text{maximum}$$

10. A pattern recognition method for recognizing an object having landmark features
as imaged in a video field, said method comprising the steps of:

(a) correlating a first template having a first pattern similar to one of said
landmark features with respect to said video field and generating a first correlation surface to
preliminarily identify at least one candidate landmark feature of said object which could be
either a correctly identified landmark feature or one of a plurality of falsely identified landmark
features;

(b) correlating a second template having a second pattern with unlike
feature correlation with respect to said first template pattern to said candidate landmark
feature and generating a second correlation surface to determine if said candidate landmark
feature is at least one of said falsely identified landmark features;

(c) correlating a third template having a third pattern with unlike feature
correlation with respect to said first and second template patterns over said candidate
landmark feature and generating a third correlation surface in order to determine if said
candidate landmark feature is at least another of said falsely identified landmark features;

(d) repeating steps (b) and (c) above until all possible likely falsely identified
landmark features have been eliminated,

(e) selecting said second pattern, wherein said second pattern includes a
first element of said landmark feature which extends substantially completely across said
second template,

wherein said correctly identified landmark feature is accurately identified by process of
elimination of said falsely identified landmark features.

11. The method of claim 10 further comprising the step of:

(f) selecting said third pattern, wherein said third pattern includes a
second element of said landmark feature which is extended substantially completely across
said third template.

12. The method of claim 11 further comprising the steps of:

(g) repeating steps (a) - (f) above to locate at least two potential landmark
features of said object; and

(h) geometrically comparing the location of said at least two landmark
features of said object against a geometric model of said object to further determine if said
object has been correctly identified.
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13. The method of claim 1 2 further comprising the step of: 

(i) recording a series of consecutive video frames and verifying the 
accuracy of said method for recognizing an object by testing it against said series of video 
frames over a period of time to determine tf it has correctly identified said object. 

14. The method of claim 1 3 further comprising the step of: 

(j) determining the sub-pixel maximum value of at least a first, second and 
third pixel each having a top with a midpoint and all having a common baseline by forming a 
first line through the midpoint of the top of said first and second pixels, said first line forming 
an angle α with respect to said common baseline and subsequently drawing a second line 
through the top midpoint of said third pixel, said second line having an angle 180° - α with 
respect to said common baseline, 

wherein the triangular intersection of said first and second lines 
approximates the location of the maximum value of said three pixels. 
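
The construction in step (j) can be worked through numerically. In the sketch below, each pixel is represented by the midpoint of its top as an (x, y) pair, where y is the pixel value above the common baseline. The first line has slope tan α, the second has slope tan(180° - α) = -tan α, and their intersection is returned. The function name `triangle_peak` and the sample values are assumptions for illustration.

```python
def triangle_peak(p1, p2, p3):
    """Intersect the line through p1 and p2 with the mirrored-slope line through p3.

    Each point is (x, value). Returns the (x, value) of the intersection, which
    approximates the sub-pixel location and height of the maximum.
    """
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    slope = (y2 - y1) / (x2 - x1)          # tan(alpha) of the first line
    if slope == 0:
        raise ValueError("first two pixels have equal values; the lines never meet")
    # Solve y1 + slope*(x - x1) == y3 - slope*(x - x3) for x.
    x = (x1 + x3) / 2 + (y3 - y1) / (2 * slope)
    y = y1 + slope * (x - x1)
    return x, y


# Pixels at x = -1, 0, 1 with values 2, 8, 6: the peak lies right of the centre pixel.
print(triangle_peak((-1, 2), (0, 8), (1, 6)))   # about (0.333, 10.0)
# Symmetric neighbours leave the peak on the centre pixel.
print(triangle_peak((-1, 4), (0, 8), (1, 4)))   # (0.0, 8.0)
```

Here the first line is drawn through the lower neighbour and the centre pixel, which is one common choice; the claim itself only requires a line through the first and second pixels.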

15. The method of claim 14 further comprising the step of: 

(k) inserting an always fail flag into the end of each frame tested in step (i) 
thereby forcing the method to repeat steps (a) - (f) at least once per field, 

wherein an effort is made to recognize said object once each frame. 

16. The method of claim 15 further comprising the step of: 

(l) inserting an alternative image in place of said object in said video field. 

17. The method of claim 10 wherein at least two correlation surfaces are 
generated and a landmark feature is determined to exist when the following total function is 
maximized: 

$$T = \sum \Big[ \sum P_{\alpha_i}(m, n) \pm \sum P_{\alpha_i}(m+v, n+h) \Big] = \text{maximum}$$

where T = Total or Sum of function 

P = a given line 

$\alpha_i$ = angle of a given line P to abscissa 

i = 2 or more 

m = location of at least a first search template on the vertical, i.e., ordinate, axis 

n = location of at least a first search template on the horizontal, i.e., abscissa, axis, 

v = vertical offset of at least a second search template from point m 

h = horizontal offset of at least a second search template from point n. 
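
As a rough illustration of the total function of claim 17, the sketch below reads it as the response of a first search template at (m, n) combined, with a plus or minus sign, with the response of a second search template at (m+v, n+h), and searches for the (m, n) that maximizes the total. The function name `maximize_total`, the `sign` parameter, and the two small correlation surfaces are hypothetical.

```python
def maximize_total(surface_1, surface_2, v, h, sign=1):
    """Return (T, m, n) maximizing surface_1[m][n] + sign * surface_2[m+v][n+h].

    m indexes rows (vertical axis) and n indexes columns (horizontal axis).
    Positions whose offset partner falls outside surface_2 are skipped.
    """
    best = None
    for m, row in enumerate(surface_1):
        for n, value in enumerate(row):
            m2, n2 = m + v, n + h
            if not (0 <= m2 < len(surface_2) and 0 <= n2 < len(surface_2[0])):
                continue
            total = value + sign * surface_2[m2][n2]
            if best is None or total > best[0]:
                best = (total, m, n)
    return best


# Assumed correlation surfaces: the first template alone peaks at (0, 1),
# but the combined total peaks at (1, 1), where both templates agree.
surface_1 = [[0, 6, 0], [0, 5, 0], [0, 0, 0]]
surface_2 = [[0, 0, 0], [0, 0, 1], [0, 0, 6]]
print(maximize_total(surface_1, surface_2, v=1, h=1))  # (11, 1, 1)
```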

18. The method of claim 17 wherein said landmark feature is located at the 
intersection of two orthogonal straight lines by maximizing the function: 

$$T = \sum \Big[ \sum P(\mathrm{col}, 90, (n), (m+v)) - \sum P(\mathrm{row}, 0, (n-h), (m)) + \sum P(\mathrm{row}, 0, (n+h), (m)) - \sum P(\mathrm{col}, 90, (n), (m-v)) \Big] = \text{maximum}$$

19. A pattern recognition method for recognizing an object having distinctive 
features as imaged in a video field, said method comprising the steps of: 

(a) passing a first template having a first pattern similar to one of said 
distinctive features over said video field and comparing the same in order to preliminarily 
identify at least one possible distinctive feature of said object which could be either a 
correctly identified distinctive feature or one of a plurality of incorrectly identified features; 
and, 

(b) passing a second template having a second pattern different from said 
first template pattern over said possible distinctive feature and comparing the same in order 
to determine if said possible distinctive feature is at least one of said incorrectly identified 
features, 

wherein said second pattern includes a first element of said distinctive feature which 
extends substantially completely across said second template. 
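
The idea of a second pattern containing one element of the feature extended across the template can be illustrated as follows. The helpers `extend_row` and `extend_column` and the 3x3 corner pattern are assumptions for this sketch; the claim does not prescribe how the extended template is built.

```python
def extend_row(template, row):
    """New template of the same size keeping only `row`, filled across the full width."""
    height, width = len(template), len(template[0])
    value = max(template[row])  # strength of the element found in that row
    return [[value if r == row else 0 for _ in range(width)] for r in range(height)]


def extend_column(template, col):
    """New template of the same size keeping only `col`, filled across the full height."""
    height, width = len(template), len(template[0])
    value = max(template[r][col] for r in range(height))
    return [[value if c == col else 0 for c in range(width)] for _ in range(height)]


# Hypothetical first pattern: a corner made of a horizontal and a vertical stroke.
corner = [[0, 0, 0],
          [0, 1, 1],
          [0, 1, 0]]

print(extend_row(corner, 1))     # [[0, 0, 0], [1, 1, 1], [0, 0, 0]]
print(extend_column(corner, 1))  # [[0, 1, 0], [0, 1, 0], [0, 1, 0]]
```

A true corner matches the extended stroke only partially, whereas a line that continues through the candidate position matches it fully, which is what lets the extended template separate the two.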

20. A pattern recognition method for recognizing an object having landmark 
features as imaged in a video field, said method comprising the steps of: 

(a) correlating a first template having a first pattern similar to one of said 
landmark features with respect to said video field and generating a first correlation surface to 
preliminarily identify at least one candidate landmark feature of said object which could be 
either a correctly identified landmark feature or one of a plurality of falsely identified landmark 
features; and, 

(b) correlating a second template having a second pattern with unlike 
feature correlation with respect to said first template pattern to said candidate landmark 
feature and generating a second correlation surface to determine if said candidate landmark 
feature is at least one of said falsely identified landmark features, 

wherein said second pattern includes a first element of said landmark feature which 
extends substantially completely across said second template. 

21. A system for recognizing an object having landmark features as imaged in a 
video field, said system comprising: 

scanning means for scanning said object and forming a series of video fields; and, 

correlating means for correlating a first template having a first pattern similar to one of 
said landmark features with respect to said video field and generating a first correlation 
surface to preliminarily identify at least one candidate landmark feature of said object which 
could be either a correctly identified landmark feature or one of a plurality of falsely identified 
landmark features and for correlating a second template having a second pattern with unlike 
feature correlation with respect to said first template pattern to said candidate landmark 
feature and generating a second correlation surface to determine if said candidate landmark 
feature is at least one of said falsely identified landmark features, 

wherein said second pattern includes a first element of said landmark feature which 
extends substantially completely across said second template and wherein said correctly 
identified landmark feature is identified by process of eliminating possible falsely identified 
landmark features. 

22. The system of claim 21 further comprising: 

insertion means for inserting an alternative image in place of said object in said video 
field. 
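
For orientation, the simplest form of the insertion step is sketched below: once the object has been located, the pixels of an alternative image are written over it. The function `insert_image` and the tiny sample arrays are hypothetical, and the sketch deliberately ignores scaling, perspective, and occlusion, which a practical insertion means would have to handle.

```python
def insert_image(field, replacement, top, left):
    """Return a copy of the field with `replacement` written at (top, left).

    Pixels of the replacement that fall outside the field are dropped.
    """
    out = [row[:] for row in field]  # leave the original field untouched
    for r, row in enumerate(replacement):
        for c, value in enumerate(row):
            rr, cc = top + r, left + c
            if 0 <= rr < len(out) and 0 <= cc < len(out[0]):
                out[rr][cc] = value
    return out


field = [[0, 0, 0, 0],
         [0, 0, 0, 0],
         [0, 0, 0, 0]]
logo = [[7, 8],
        [9, 1]]
print(insert_image(field, logo, 1, 2))  # [[0, 0, 0, 0], [0, 0, 7, 8], [0, 0, 9, 1]]
```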

23. A pattern recognition method for recognizing an object having distinctive 
features as imaged in a video field, said method comprising the steps of: 

(a) passing a first template having a first pattern similar to one of said 
distinctive features over said video field and comparing the same in order to preliminarily 
identify at least one possible distinctive feature of said object which could be either a 
correctly identified distinctive feature or one of a plurality of incorrectly identified features; 

(b) passing a second template having a second pattern different from said 
first template pattern over said possible distinctive feature and comparing the same in order 
to determine if said possible distinctive feature is at least one of said incorrectly identified 
features; 

(c) passing a third template having a third pattern different from said first 
and second template patterns over said possible distinctive features and comparing the 
same in order to determine if said possible distinctive feature is at least another of said 
incorrectly identified features, 

wherein at least two templates are employed and a distinctive feature is determined 
to exist when the following total function T is maximized: 

$$T = \sum \Big[ \sum P_{\alpha_i}(m, n) \pm \sum P_{\alpha_i}(m+v, n+h) \Big] = \text{maximum}$$

where T = Total or Sum of function 
P = a given line 

$\alpha_i$ = angle of a given line P to abscissa 
i = 2 or more 

m = location of at least a first search template on the vertical, i.e., ordinate, axis 



WO 96/24115    PCT/US96/01125 

n = location of at least a first search template on the horizontal, i.e., abscissa, axis, 
v = vertical offset of at least a second search template from point m 
h = horizontal offset of at least a second search template from point n. 

24. A pattern recognition method for recognizing an object having landmark 
features as imaged in a video field, said method comprising the steps of: 

(a) correlating a first template having a first pattern similar to one of said 
landmark features with respect to said video field and generating a first correlation surface to 
preliminarily identify at least one candidate landmark feature of said object which could be 
either a correctly identified landmark feature or one of a plurality of falsely identified landmark 
features; 

(b) correlating a second template having a second pattern with unlike 
feature correlation with respect to said first pattern to said candidate landmark feature and 
generating a second correlation surface to determine if said landmark feature is at least one 
of said falsely identified landmark features; 

(c) correlating a third template having a third pattern with unlike feature 
correlation with respect to said first and second template patterns over said candidate 
landmark feature and generating a third correlation surface in order to determine if said 
candidate landmark feature is at least another of said falsely identified landmark features, 

wherein at least two correlation surfaces are generated and a landmark feature is 
determined to exist when the following total function is maximized: 

$$T = \sum \Big[ \sum P_{\alpha_i}(m, n) \pm \sum P_{\alpha_i}(m+v, n+h) \Big] = \text{maximum}$$

where T = Total or Sum of function 
P = a given line 

$\alpha_i$ = angle of a given line P to abscissa 
i = 2 or more 
m = location of at least a first search template on the vertical, i.e., ordinate, axis 
n = location of at least a first search template on the horizontal, i.e., abscissa, axis, 
v = vertical offset of at least a second search template from point m 
h = horizontal offset of at least a second search template from point n. 

25. A pattern recognition method for recognizing an object having distinctive 
features as imaged in a video field, said method comprising the steps of: 

(a) passing a first template having a first pattern similar to one of said 
distinctive features over said video field and comparing the same in order to preliminarily 
identify at least one possible distinctive feature of said object which could be either a 
correctly identified distinctive feature or one of a plurality of incorrectly identified features; 

(b) passing a second template having a second pattern different from said 
first template pattern over said possible distinctive feature and comparing the same in order 
to determine if said possible distinctive feature is at least one of said incorrectly identified 
features, 

wherein said second pattern does not match any distinctive feature sought to be 
recognized and includes at least one extended element different from said first pattern in 
order to more rapidly distinguish said distinctive features. 

26. A pattern recognition method for recognizing an object having landmark 
features as imaged in a video field, said method comprising the steps of: 

(a) correlating a first template having a first pattern similar to one of said 
landmark features with respect to a video field and generating a first correlation surface to 
preliminarily identify at least one candidate landmark feature of said object which could be 
either a correctly identified landmark feature or one of a plurality of falsely identified landmark 
features; and, 

(b) correlating a second template having a second pattern with unlike 
feature correlation with respect to said first template pattern to said candidate landmark 
feature and generating a second correlation surface to determine if said candidate landmark 
feature is at least one of said falsely identified landmark features, 

wherein said second pattern does not match any landmark feature sought to be 
recognized and includes at least one extended element different from said first pattern in 
order to more rapidly distinguish said landmark features. 

27. A system for recognizing an object having landmark features as imaged in a 
video field, said system comprising: 

scanning means for scanning said object and forming a series of video fields; and, 

correlating means for correlating a first template having a first pattern similar to one of 
said landmark features with respect to said video field and generating a first correlation 
surface to preliminarily identify at least one candidate landmark feature of said object which 
could be either a correctly identified landmark feature or one of a plurality of falsely identified 
landmark features and for correlating a second template having a second pattern with unlike 
feature correlation with respect to said first template pattern to said candidate landmark 
feature and generating a second correlation surface to determine if said candidate landmark 
feature is at least one of said falsely identified landmark features, 

wherein said second pattern does not match any landmark feature sought to be 
recognized and includes at least one extended element different from said first pattern in 
order to more rapidly distinguish said landmark features. 

STATEMENT UNDER ARTICLE 19 



The replacement sheets containing the new and amended claims are the result of 
an office action received in a corresponding U.S. patent application. Applicant has 
amended the present claims in an effort to keep the prosecution of the PCT application 
consistent with the prosecution of the U.S. application which is the priority document for 
the present application. 

New claim 1 is a combination of original claims 1-4 and replaces same. Original 
claim 4 was indicated as having allowable subject matter in the U.S. application. Thus, a 
new independent claim incorporating all the limitations up to original claim 4 was drafted 
which would likewise contain allowable subject matter. New claims 2-9 are simply original 
claims 5-12 with slight amendments to provide greater clarity. These dependent claims all 
refer to new claim 1 and, as such, are all believed allowable. 

New claim 10 is a combination of original claims 13-16 and replaces same. 
Original claim 16 was indicated as having allowable subject matter in the U.S. application. 
Thus, a new independent claim incorporating all the limitations of original claims 13-16 
was drafted which would likewise contain allowable subject matter. New claims 11-18 are 
simply original claims 17-24 with slight amendments to provide greater clarity. These 
dependent claims all refer to new claim 10 and, as such, are all believed allowable. 

New claim 19 is a combination of original claims 1 and 4. New claim 20 is a 
combination of original claims 13 and 16. New claim 21 is similar to original claim 25 but 
includes language similar to original claim 4. New claim 22 depends from new claim 21. 
New claim 23 is a combination of original claims 1, 2 and 11. New claim 24 is a 
combination of original claims 13 and 23 and includes language similar to original claim 4. 
New claim 25 expands on original claim 1. New claims 26-27 each include similar 
language to original claims 13 and 25 but include additional limitations. Since original 
claims 4 and 11 were indicated as containing allowable subject matter in the U.S. 
application, it is believed that new claims incorporating language similar to original claims 4 
and 11 are likewise allowable. Further, applicant believes the additional limitations 
contained in the other new claims also render them allowable. 